Top-down particle filtering for Bayesian decision trees
نویسندگان
چکیده
Decision tree learning is a popular approach for classification and regression in machine learning and statistics, and Bayesian formulations— which introduce a prior distribution over decision trees, and formulate learning as posterior inference given data—have been shown to produce competitive performance. Unlike classic decision tree learning algorithms like ID3, C4.5 and CART, which work in a top-down manner, existing Bayesian algorithms produce an approximation to the posterior distribution by evolving a complete tree (or collection thereof) iteratively via local Monte Carlo modifications to the structure of the tree, e.g., using Markov chain Monte Carlo (MCMC). We present a sequential Monte Carlo (SMC) algorithm that instead works in a top-down manner, mimicking the behavior and speed of classic algorithms. We demonstrate empirically that our approach delivers accuracy comparable to the most popular MCMC method, but operates more than an order of magnitude faster, and thus represents a better computation-accuracy tradeoff.
منابع مشابه
Particle Gibbs for Bayesian Additive Regression Trees
Additive regression trees are flexible nonparametric models and popular off-the-shelf tools for real-world non-linear regression. In application domains, such as bioinformatics, where there is also demand for probabilistic predictions with measures of uncertainty, the Bayesian additive regression trees (BART) model, introduced by Chipman et al. (2010), is increasingly popular. As data sets have...
متن کاملDecision Trees and Forests: A Probabilistic Perspective
Decision trees and ensembles of decision trees are very popular in machine learning and often achieve state-of-the-art performance on black-box prediction tasks. However, popular variants such as C4.5, CART, boosted trees and random forests lack a probabilistic interpretation since they usually just specify an algorithm for training a model. We take a probabilistic approach where we cast the de...
متن کاملAn Evolutionary Algorithm for Global Induction of Regression Trees with Multivariate Linear Models
In the paper we present a new evolutionary algorithm for induction of regression trees. In contrast to the typical top-down approaches it globally searches for the best tree structure, tests at internal nodes and models at the leaves. The general structure of proposed solution follows a framework of evolutionary algorithms with an unstructured population and a generational selection. Specialize...
متن کاملMachine learning in prognosis of the femoral neck fracture recovery
We compare the performance of several machine learning algorithms in the problem of prognostics of the femoral neck fracture recovery: the K-nearest neighbours algorithm, the semi-naive Bayesian classifier, backpropagation with weight elimination learning of the multilayered neural networks, the LFC (lookahead feature construction) algorithm, and the Assistant-I and Assistant-R algorithms for t...
متن کاملTop-down induction of logical decision trees
Top-down induction of decison trees (TDIDT) is a very popular machine learning technique. Up till now, it has mainly used for propositional learning, but seldomly for relational learning or inductive logic programming. The main contribution of this paper is the introduction of logic decision trees, which make it possible to use TDIDT in inductive logic programming. An implementation of this top...
متن کامل